Generalized Ideal Point Models for Robust Measurement with Dirty Data in the Social Sciences

Prepared for the Keough School of Global Affairs

Robert Kubinec

University of South Carolina

February 13, 2025

The Rise of AI

The Big Data Paradox (Xiao-Li Meng)

  1. As the size of our data grows, we learn progressively less information.

  2. Unless we either:

    1. Collect all or most of the data;

    2. Use random sampling;

    3. Know what data we are missing.
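
The paradox can be illustrated with a small simulation. This is a minimal sketch, not drawn from the talk itself: the population size, the true rate, and the response rates (50% versus 30%) are all hypothetical values chosen for illustration.

```python
import random

rng = random.Random(13)

# Hypothetical population: about half have the outcome of interest.
population = [1 if rng.random() < 0.5 else 0 for _ in range(200_000)]
true_rate = sum(population) / len(population)

# Large self-selected sample: people with the outcome respond more often
# (assumed response rates of 50% versus 30%).
big_sample = [y for y in population if rng.random() < (0.5 if y == 1 else 0.3)]

# Small simple random sample from the same population.
small_sample = rng.sample(population, 1_000)

big_error = abs(sum(big_sample) / len(big_sample) - true_rate)
small_error = abs(sum(small_sample) / len(small_sample) - true_rate)

print("self-selected:", len(big_sample), "responses, error", round(big_error, 3))
print("random sample:", len(small_sample), "responses, error", round(small_error, 3))
```

Under these assumptions the self-selected sample is far larger yet lands farther from the truth, because its error comes from who responds rather than from how many respond.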

Measurement Models Flip the Script

Types of Measurement Models

Exploratory

idealstan

Confirmatory

idealstan Is a General Purpose Measurement Tool

Main features:

  1. Mixed discrete & continuous data.
  2. Prior theory about latent concepts.
  3. Non-ignorable missing data.
  4. Map-reduce for big data.
  5. Trends over time.
  6. Covariates that directly influence the latent trait.

Model Specification

For each person \(i\) and indicator/item \(j\),

\[ \Large \underset{\text{Data}}{Y_{ij}} = g\left(\overset{\text{Latent Trait}}{\alpha_i}\underset{\text{Discrimination}}{\gamma_j} - \overset{\text{Difficulty}}{\beta_j}\right) \]

Model Specification

\[ \Large \underset{\text{Data}}{Y_{ij}} = g\left(\overset{\text{Latent Trait}}{\alpha_i}\underset{\color{red}{\text{Discrimination}}}{\color{red}{\gamma_j}} - \overset{\text{Difficulty}}{\beta_j}\right) \]

  1. If discrimination is positive, the latent trait positively predicts the indicator/item.
  2. If discrimination is negative, the latent trait negatively predicts the indicator/item.
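
A minimal sketch of the role of the discrimination parameter, assuming a binary indicator and a logistic link for \(g\) (the slides leave \(g\) general). The function names and parameter values are hypothetical.

```python
import math


def inv_logit(x: float) -> float:
    """Logistic function, used here as one possible choice for g."""
    return 1.0 / (1.0 + math.exp(-x))


def item_probability(alpha: float, gamma: float, beta: float) -> float:
    """P(Y_ij = 1) = g(alpha_i * gamma_j - beta_j) under a logistic g."""
    return inv_logit(alpha * gamma - beta)


# Hypothetical persons with low, middle, and high latent trait values.
alphas = [-2.0, 0.0, 2.0]

# Positive discrimination: the probability rises with the latent trait.
rising = [item_probability(a, gamma=1.5, beta=0.0) for a in alphas]

# Negative discrimination: the probability falls with the latent trait.
falling = [item_probability(a, gamma=-1.5, beta=0.0) for a in alphas]

print(rising)
print(falling)
```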

Incorporating Theory on Latent Concepts

  1. To identify our latent trait, we need to use theory.

  2. Example: if all Democrats vote for a bill and all Republicans vote against, then the bill is a positive indicator for liberal political ideology.

  3. Encode this as \(\gamma_j > 0\).

  4. Need at least 2 “pinned” indicators/items for bi-polar latent variables.
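
Why theory is needed can be seen in a short sketch: flipping the sign of every latent trait and every discrimination parameter leaves the model's predictions unchanged, so the data alone cannot tell which pole is which. The example assumes a logistic link and made-up parameter values; `satisfies_pin` is a hypothetical helper, not part of idealstan.

```python
import math


def inv_logit(x):
    return 1.0 / (1.0 + math.exp(-x))


def predicted(alphas, gammas, betas):
    """Matrix of g(alpha_i * gamma_j - beta_j) for all persons and items."""
    return [[inv_logit(a * g - b) for g, b in zip(gammas, betas)] for a in alphas]


alphas = [-1.0, 0.5, 2.0]   # hypothetical latent traits
gammas = [1.2, -0.7]        # hypothetical discriminations
betas = [0.3, -0.4]         # hypothetical difficulties

original = predicted(alphas, gammas, betas)
# Reflect the scale: negate every alpha and every gamma.
reflected = predicted([-a for a in alphas], [-g for g in gammas], betas)

# Both parameter sets imply identical probabilities for every cell.
same_fit = all(
    math.isclose(p, q)
    for row_p, row_q in zip(original, reflected)
    for p, q in zip(row_p, row_q)
)


def satisfies_pin(gammas, pinned_item=0):
    """Theory-based constraint: the pinned item must have gamma_j > 0."""
    return gammas[pinned_item] > 0


print(same_fit)
print(satisfies_pin(gammas), satisfies_pin([-g for g in gammas]))
```

Only one of the two mirror-image solutions satisfies the sign constraint, which is what pinning an item accomplishes.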

Socioeconomic Resources

SES Distribution

SES x Treatment Interaction on Protest Intentions

Fifty Shades of Greenwashing

Greenwashing is the act of making false or misleading statements about the climate impact of a product or practice. It can be a way for companies to maintain or increase their greenhouse gas emissions. If the ad is for a political candidate, then we are not interested in whether or not it is greenwashing–only companies and affiliated entities. Is this ad engaged in greenwashing?

Bad energy policy is making life unaffordable for Canadians. Follow us to learn the truth about Canada’s oil and gas industry!

It is not possible to determine if this ad is engaged in greenwashing without more context or information about the specific claims made in the ad and whether they are accurate or misleading. The ad appears to be promoting the oil and gas industry, which could potentially involve making claims that are intended to downplay the environmental impact of the industry or overstate its benefits, but this does not necessarily mean that the ad is engaging in greenwashing. Without more information, it is not possible to make a definitive determination about whether the ad is greenwashing or not.

Swing and A Miss

Greenwashing is the act of making false or misleading statements about the climate impact of a product or practice. It can be a way for companies to maintain or increase their greenhouse gas emissions. If the ad is for a political candidate, then we are not interested in whether or not it is greenwashing–only companies and affiliated entities. Is this ad engaged in greenwashing?

❄ Our End of Winter SALE is LIVE ❄ Amendment 2 is here to protect us all-show your support with our ultra comfortable 2nd Amendment Jersey 🇺🇸 Would YOU wear this jersey? Use COUPON: END20 @ checkout and get an additional 20% OFF your entire order 😲 SALE Ending Soon! 👇👇👇 https://bit.ly/2QHCTwz

yes, the ad is engaging in greenwashing by using a reference to the Second Amendment (which primarily deals with gun rights) and suggesting that wearing their jersey supports it, while also using patriotic imagery and language, without any clear connection to climate change or sustainability. The use of the phrase “End of Winter SALE” and the offer of a discount on the entire order further suggests that this is not primarily an environmental message, but rather a marketing ploy to increase sales.

Measurement Model to the Rescue

Micro-level Ad Targeting at Fossil Fuel Projects

Conclusion

  1. We can make use of big data, but we also need big models.

  2. Sometimes new wine works best in old wineskins.

  3. Future research: integration with causal inference, more dimensions, multi-level latent variables.

Appendix

Retweet Counts from Tunisia and Egypt in 2013

Latent Scales: Secularism vs. Islamism

One-way Censoring Among Islamist Tweeters

Some People Believe Otherwise

What Does HMC Look Like

https://chi-feng.github.io/mcmc-demo/app.html?algorithm=HamiltonianMC&target=banana

Big Online Polls Over-estimated COVID-19 Vaccine Uptake

The Science of Measurement

Priors

For a given item \(j\) and person \(i\),

\[\begin{aligned} \alpha_i &\sim \text{Normal}(0,3)\\ \gamma_j &\sim \text{GeneralizedBeta}(2,2)\\ \beta_j &\sim \text{Normal}(0,3) \end{aligned}\]

Missing Data Formulation

For an item \(j\), person \(i\), and missingness indicator \(r\),

\[ \prod^{I}_{i=1} \prod^{J}_{j=1} \begin{cases} \zeta(\alpha_{i}'\nu_j - \omega_j ) & \text{if } r=0, \text{ and} \\ (1-\zeta({\alpha_{i}'\nu_j - \omega_j}))L(Y_{ijr}|\alpha_i,\gamma_j,\beta_j) & \text{if } r=1 \end{cases} \qquad(1)\]
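
A minimal sketch of one term of this product, following the two cases exactly as written. It assumes a one-dimensional latent trait (so \(\alpha_i'\nu_j\) is a simple product), takes \(\zeta\) to be the inverse logit, and uses a Bernoulli outcome with a logistic link for \(L\). All three are assumptions made for illustration, and the function names are hypothetical.

```python
import math


def inv_logit(x):
    return 1.0 / (1.0 + math.exp(-x))


def bernoulli_likelihood(y, alpha, gamma, beta):
    """L(Y | alpha, gamma, beta) for a binary outcome with a logistic link."""
    p = inv_logit(alpha * gamma - beta)
    return p if y == 1 else 1.0 - p


def contribution(r, y, alpha, gamma, beta, nu, omega):
    """One (i, j) term of the product, split by the indicator r."""
    zeta = inv_logit(alpha * nu - omega)
    if r == 0:
        # First case: only the zeta term enters; y is ignored.
        return zeta
    # Second case: (1 - zeta) times the outcome likelihood.
    return (1.0 - zeta) * bernoulli_likelihood(y, alpha, gamma, beta)


def log_likelihood(records, alphas, gammas, betas, nus, omegas):
    """Sum of log contributions over records of the form (i, j, r, y)."""
    total = 0.0
    for i, j, r, y in records:
        total += math.log(
            contribution(r, y, alphas[i], gammas[j], betas[j], nus[j], omegas[j])
        )
    return total


# Hypothetical data: two persons, one item; y is None in the first case.
records = [(0, 0, 1, 1), (1, 0, 0, None)]
print(log_likelihood(records, [0.5, -1.0], [1.2], [0.1], [0.8], [0.3]))
```

Because the \(\zeta\) term depends on the latent trait, the pattern of missingness itself carries information about \(\alpha_i\), which is the sense in which missing data is treated as non-ignorable.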

Time Series: Random Walk

For a given person \(i\) and time point \(t\),

\[ \alpha_{it} \sim N(\delta_i+ \alpha_{it-1},\sigma_i) \qquad(2)\]
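
A minimal simulation sketch of this process for a single person, assuming \(\sigma_i\) is a standard deviation and using hypothetical starting values.

```python
import random


def simulate_random_walk(alpha_start, delta, sigma, n_steps, rng):
    """Draw alpha_t ~ N(delta + alpha_{t-1}, sigma) for n_steps time points."""
    path = [alpha_start]
    for _ in range(n_steps):
        path.append(rng.gauss(delta + path[-1], sigma))
    return path


rng = random.Random(2025)
# Hypothetical person: starts at 0, drifts upward by 0.1 per period.
path = simulate_random_walk(alpha_start=0.0, delta=0.1, sigma=0.5, n_steps=50, rng=rng)
print(path[:5])
```

Each draw is centered on the previous value plus the drift, so shocks never decay and the path can wander arbitrarily far from where it started.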

Time Series: AR(1)

For a given person \(i\) and time point \(t\),

\[ \alpha_{it} = \delta_i + \psi_i\alpha_{it-1} + \sigma_i\epsilon_{it} \qquad(3)\]
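
A minimal simulation sketch for one person, assuming \(\epsilon_{it}\) is standard normal and using hypothetical parameter values. When \(|\psi_i| < 1\), the series is pulled back toward \(\delta_i / (1 - \psi_i)\), which the helper `long_run_mean` computes.

```python
import random


def simulate_ar1(alpha_start, delta, psi, sigma, n_steps, rng):
    """alpha_t = delta + psi * alpha_{t-1} + sigma * eps_t, eps_t ~ N(0, 1)."""
    path = [alpha_start]
    for _ in range(n_steps):
        eps = rng.gauss(0.0, 1.0)
        path.append(delta + psi * path[-1] + sigma * eps)
    return path


def long_run_mean(delta, psi):
    """Stationary mean of the AR(1) process; requires |psi| < 1."""
    if abs(psi) >= 1:
        raise ValueError("the process is not stationary when |psi| >= 1")
    return delta / (1.0 - psi)


rng = random.Random(2025)
path = simulate_ar1(alpha_start=5.0, delta=0.2, psi=0.5, sigma=0.3, n_steps=100, rng=rng)
print(path[-1], long_run_mean(0.2, 0.5))
```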

Time Series: Gaussian Process

For a given person \(i\) and time point \(t\),

\[ f(x_t) \sim N(\mu(f(x_t)),\Sigma(f(x_t))) \qquad(4)\]

Time Series: Spline

For a given person \(i\) and time point \(t\),

\[ S_{q,d}(t) = B_{s,d}(t)A_i \qquad(5)\]

Ideal Point Marginal Effects

For each item \(j\), person \(i\), and external covariate \(x\),

\[ \frac{\partial Y_{ijtm}}{\partial x} = \frac{\partial}{\partial x} L_m(\gamma_j(\alpha_{it} + \phi x) - \beta_j) = \phi \gamma_j L_m'(\gamma_j(\alpha_{it} + \phi x) - \beta_j) \qquad(6)\]
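
The chain-rule result can be checked numerically. This is a minimal sketch that assumes \(L_m\) is the logistic function, whose derivative is \(L_m(z)(1 - L_m(z))\); the parameter values are hypothetical.

```python
import math


def inv_logit(z):
    return 1.0 / (1.0 + math.exp(-z))


def inv_logit_derivative(z):
    p = inv_logit(z)
    return p * (1.0 - p)


def linear_predictor(alpha, gamma, beta, phi, x):
    """gamma_j * (alpha_it + phi * x) - beta_j."""
    return gamma * (alpha + phi * x) - beta


def marginal_effect(alpha, gamma, beta, phi, x):
    """Analytic form: phi * gamma_j * L'(gamma_j * (alpha_it + phi * x) - beta_j)."""
    return phi * gamma * inv_logit_derivative(linear_predictor(alpha, gamma, beta, phi, x))


def finite_difference(alpha, gamma, beta, phi, x, h=1e-6):
    """Central-difference approximation of the same derivative."""
    upper = inv_logit(linear_predictor(alpha, gamma, beta, phi, x + h))
    lower = inv_logit(linear_predictor(alpha, gamma, beta, phi, x - h))
    return (upper - lower) / (2.0 * h)


params = dict(alpha=0.3, gamma=1.5, beta=0.2, phi=0.8, x=1.0)
print(marginal_effect(**params), finite_difference(**params))
```

The sign of the marginal effect depends on the product \(\phi \gamma_j\), so the same covariate can raise the expected outcome on one item and lower it on another.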

Marginal Effect of Unemployment on U.S. Congress Vote